Cluster validation using information stability measures

Authors

  • Damaris Pascual
  • Filiberto Pla
  • José Salvador Sánchez
Abstract

In this work, a novel technique for cluster validation based on cluster stability properties is presented. The proposed stability index is based on the variation of some information measures over the partitions generated by a given clustering model, where that variation arises from the variability in the clustering solutions produced by different sample sets.

doi:10.1016/j.patrec.2009.07.009
© 2009 Elsevier B.V. All rights reserved.
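The abstract does not specify the information measures or the form of the index, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes, purely for illustration, that the information measure is the Shannon entropy of the cluster-size distribution, that the clustering model is a tiny 1-D k-means, and that "different sample sets" are random subsamples of the data. The names kmeans_1d, partition_entropy and entropy_variation are hypothetical.

```python
import math
import random
import statistics


def kmeans_1d(points, k, iters=20):
    """Tiny 1-D k-means with quantile initialisation; returns one label per point."""
    pts = sorted(points)
    # Assumption: initial centers are taken at evenly spaced quantiles of the data.
    centers = [pts[(2 * j + 1) * len(pts) // (2 * k)] for j in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda j: abs(p - centers[j])) for p in points]
        # Move each center to the mean of its members (empty clusters stay put).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels


def partition_entropy(labels):
    """Shannon entropy (in nats) of the cluster-size distribution of a partition."""
    n = len(labels)
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return -sum(c / n * math.log(c / n) for c in counts.values())


def entropy_variation(points, k, n_samples=20, frac=0.8, seed=0):
    """Standard deviation of the partition entropy over random subsamples.

    Illustrative assumption: a smaller value is read as a more stable model.
    """
    rng = random.Random(seed)
    size = int(frac * len(points))
    entropies = []
    for _ in range(n_samples):
        sample = rng.sample(points, size)  # one "sample set"
        entropies.append(partition_entropy(kmeans_1d(sample, k)))
    return statistics.pstdev(entropies)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic data with two well-separated groups (an assumption for the demo).
    data = [rng.gauss(0, 1) for _ in range(100)] + [rng.gauss(10, 1) for _ in range(100)]
    for k in (2, 3, 4):
        print(k, entropy_variation(data, k))
```

Running the script prints one variation value per candidate number of clusters. Under the stated assumptions, a candidate whose information measure changes little across sample sets would be treated as the more stable choice.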

Similar resources

Cluster Stability for Finite Samples

Over the past few years, the notion of stability in data clustering has received growing attention as a cluster validation criterion in a sample-based framework. However, recent work has shown that as the sample size increases, any clustering model will usually become asymptotically stable. This led to the conclusion that stability is lacking as a theoretical and practical tool. The discrepancy...

Cluster Validity Measures Dynamic Clustering Algorithms

Cluster analysis finds its place in many applications especially in data analysis, image processing, pattern recognition, market research by grouping customers based on purchasing pattern, classifying documents on web for information discovery, outlier detection applications and act as a tool to gain insight into the distribution of data to observe characteristics of each cluster. This ensures ...

A Resampling Approach to Cluster Validation

The concept of cluster stability is introduced as a means for assessing the validity of data partitionings found by clustering algorithms. It allows us to explicitly quantify the quality of a clustering solution, without being dependent on external information. The principle of maximizing the cluster stability can be interpreted as choosing the most self-consistent data partitioning. We present...

clValid, an R package for cluster validation

The R package clValid contains functions for validating the results of a clustering analysis. There are three main types of cluster validation measures available, “internal”, “stability”, and “biological”. The user can choose from nine clustering algorithms in existing R packages, including hierarchical, K-means, self-organizing maps (SOM), and model based clustering. In addition, we provide a ...

A New Asymmetric Criterion for Cluster Validation

In this paper a new criterion for clusters validation is proposed. Many stability measures to validate a cluster have been proposed such as Normalized Mutual Information. We propose a new criterion for clusters validation. The drawback of the common approach is discussed in this paper and then a new asymmetric criterion is proposed to assess the association between a cluster and a partition whi...

Journal:
  • Pattern Recognition Letters

Volume 31

Publication year: 2010